Japanese Kana-to-Kanji Conversion Using Large Scale Collocation Data

Authors

  • Yasuo Koyama
  • Masako Yasutake
  • Kenji Yoshimura
  • Kosho Shudo
Abstract

Japanese word processors, or the computers used in Japan, employ an input method through keyboard strokes combined with Kana (phonetic) character to Kanji (ideographic, Chinese) character conversion technology. The key factor of Kana-to-Kanji conversion technology is how to raise the accuracy of the conversion through homophone processing, since we have so many homophones. In this paper, we report the results of our Kana-to-Kanji conversion experiments, which embody homophone processing using extensive collocation data. It is shown that approximately 135,000 collocation data yield a 9.1% rise in conversion accuracy compared with the prototype system, which has no collocation data.
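The idea of resolving homophones with collocation data can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny `HOMOPHONES` and `COLLOCATIONS` tables, the `convert` function, and the rule of falling back to the first-listed candidate are hypothetical and are not taken from the paper's system or data.

```python
# Hypothetical toy data: none of these entries come from the paper.
# Each kana reading maps to its kanji candidates in an assumed default order.
HOMOPHONES = {
    "はし": ["橋", "箸", "端"],  # bridge, chopsticks, edge
    "かみ": ["紙", "髪", "神"],  # paper, hair, god
}

# Hypothetical collocation pairs: (kanji word, co-occurring word).
COLLOCATIONS = {
    ("橋", "渡る"),  # cross a bridge
    ("箸", "使う"),  # use chopsticks
    ("髪", "切る"),  # cut hair
    ("紙", "折る"),  # fold paper
}


def convert(reading, context_word, use_collocations=True):
    """Pick a kanji candidate for a kana reading given one context word."""
    # Unknown readings are returned unchanged (left in kana).
    candidates = HOMOPHONES.get(reading, [reading])
    if use_collocations:
        # Prefer the first candidate that forms a known collocation.
        for candidate in candidates:
            if (candidate, context_word) in COLLOCATIONS:
                return candidate
    # Fallback: the first-listed candidate, standing in for a
    # baseline that has no collocation data.
    return candidates[0]
```

In this sketch, `convert("はし", "使う")` selects 箸 because of the collocation pair, whereas `convert("はし", "使う", use_collocations=False)` falls back to the default candidate 橋.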

Similar resources

Large Scale Collocation Data and Their Application to Japanese Word Processor Technology

Word processors or computers used in Japan employ Japanese input method through keyboard stroke combined with Kana (phonetic) character to Kanji (ideographic, Chinese) character conversion technology. The key factor of Kana-to-Kanji conversion technology is how to raise the accuracy of the conversion through the homophone processing, since we have so many homophonic Kanjis. In this paper, we re...

Using Collocations and K-means Clustering to Improve the N-pos Model for Japanese IME

Kana-Kanji conversion is known as one of the representative applications of Natural Language Processing (NLP) for the Japanese language. The N-pos model, presenting the probability of a Kanji candidate sequence by the product of bi-gram Part-of-Speech (POS) probabilities and POS-to-word emission probabilities, has been successfully applied in a number of well-known Japanese Input Method Editor ...
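The scoring rule described above, a product of bi-gram POS probabilities and POS-to-word emission probabilities, can be sketched as follows. The probability tables, the `<s>` start symbol, and the function name `npos_log_score` are hypothetical values and names chosen for illustration; they are not taken from the cited work.

```python
import math

# Hypothetical probabilities for illustration only.
POS_BIGRAM = {
    ("<s>", "NOUN"): 0.5,
    ("NOUN", "PARTICLE"): 0.6,
    ("PARTICLE", "VERB"): 0.7,
}
EMISSION = {
    ("NOUN", "橋"): 0.02,
    ("NOUN", "箸"): 0.01,
    ("PARTICLE", "を"): 0.3,
    ("VERB", "渡る"): 0.05,
}


def npos_log_score(tagged_words):
    """Log-probability of a (word, POS) sequence under the toy N-pos tables."""
    score = 0.0
    prev_pos = "<s>"  # assumed sentence-start symbol
    for word, pos in tagged_words:
        # One factor per word: POS transition times POS-to-word emission.
        p = POS_BIGRAM.get((prev_pos, pos), 0.0) * EMISSION.get((pos, word), 0.0)
        if p == 0.0:
            return float("-inf")  # unseen event in this unsmoothed sketch
        score += math.log(p)
        prev_pos = pos
    return score
```

Under such a model, two candidates that share a POS sequence, for example 橋を渡る and 箸を渡る, differ only in their emission terms, so the choice between them does not depend on the neighbouring words.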

Discriminative Method for Japanese Kana-Kanji Input Method

The most popular type of input method in Japan is kana-kanji conversion, that is, conversion from a string of kana to a mixed kanji-kana string. However, there is no study using discriminative methods like structured SVMs for kana-kanji conversion. One of the reasons is that learning a discriminative model from a large data set is often intractable. However, due to the progress of recent research, large sca...

Collocations as Word Co-ocurrence Restriction Data - An Application to Japanese Word Processor

Collocations, combinations of specific words, are quite useful linguistic resources for NLP in general. The purpose of this paper is to show their usefulness, exemplifying an application to Kanji character decision processes for Japanese word processors. Unlike recent trials of automatic extraction, our collocations were collected manually through many years of intensive investigation of corp...

Kana-Kanji Conversion System with Input Support Based on Prediction

1 Introduction. TOSHIBA developed the world's first Japanese word processor in 1978. Unlike languages based on an alphabet, Japanese uses thousands of kanji characters of varying complexity. Hence, to arrange all of the kanji characters on a keyboard is difficult. On the other hand, kana characters, which are phonetic scripts of Japanese, have 83 variations; these can be arranged o...

Publication date: 1998